Foundations of Trusted Autonomy
Tags: #technology #ai #robotics #ethics #trust #philosophy #cybersecurity #defense
Authors: Hussein A. Abbass, Jason Scholz, Darryn J. Reid
Overview
This book examines the foundations of trusted autonomy, aiming to provide a comprehensive overview of the key philosophical, scientific, mathematical, and practical considerations for building autonomous systems that can be trusted. We bring together diverse perspectives from researchers and practitioners across a range of disciplines, including artificial intelligence, robotics, computer science, cognitive science, philosophy, cyber security, defense, and space operations. Our target audience is broad, encompassing scientists, researchers, practitioners, technologists, and graduate students seeking to understand the complex landscape of trusted autonomy. We are motivated by the increasing deployment of autonomous systems in critical domains, and the need to ensure that these systems are reliable, safe, and aligned with human values. This book is particularly relevant to the growing debate about the societal implications of artificial intelligence and robotics, and the need to develop technologies that can be trusted to operate responsibly. We contribute to this debate by exploring the fundamental nature of autonomy, the meaning of trust, and the factors that influence trust in human-machine interactions. We argue that trusted autonomy is a prerequisite for successfully integrating autonomous systems into human society, and that achieving this goal will require a deep understanding of the technical, social, and ethical challenges involved. Ultimately, we strive to pave the way for the development of autonomous systems that are not only capable and intelligent, but also worthy of our trust.
Book Outline
2. Universal Artificial Intelligence: Practical Agents and Fundamental Challenges
This chapter introduces Universal Artificial Intelligence (UAI) as a theoretical framework for understanding and designing practical artificial intelligence systems. The chapter opens with an overview of the scientific study of intelligence, highlighting the shift from deductive reasoning to inductive approaches that are better able to handle real-world uncertainty and complexity. We present UAI as comprising four components: framework, learning, goal, and planning.
Key concept: “The ‘will’ that underpins the intent of this agent is ‘maximisation of reward’.”
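To make the reward-maximisation idea concrete, here is a minimal sketch (not the chapter's AIXI formalism): an agent scores each available action by its expected reward under its current beliefs about which environment it is in, and picks the best. The action names and toy environment models are illustrative assumptions.

```python
# Minimal sketch of a reward-maximising agent: the agent holds a belief
# distribution over candidate environment models and chooses the action
# with the highest expected reward. All names and numbers are illustrative.

def expected_reward(action, belief):
    """Average the reward each candidate model predicts, weighted by belief."""
    return sum(prob * model(action) for model, prob in belief.items())

def select_action(actions, belief):
    """Pick the action that maximises expected reward under current beliefs."""
    return max(actions, key=lambda a: expected_reward(a, belief))

# Two toy hypotheses about the environment, each mapping an action to a reward.
optimistic = lambda a: {"explore": 1.0, "wait": 0.2}[a]
pessimistic = lambda a: {"explore": -0.5, "wait": 0.1}[a]

belief = {optimistic: 0.6, pessimistic: 0.4}       # agent's current beliefs
print(select_action(["explore", "wait"], belief))  # -> "explore"
```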
3. Goal Reasoning and Trusted Autonomy
This chapter explores the concept of Goal Reasoning in the context of Trusted Autonomy. Goal Reasoning allows autonomous agents to reason about and dynamically adapt their goals, enabling them to better respond to unexpected events or changing circumstances. We present two models of Goal Reasoning: Goal-Driven Autonomy and the Goal Lifecycle. We also introduce the Situated Decision Process (SDP), a framework for managing and executing goals in a team of autonomous vehicles.
Key concept: “So goals are important computationally to achieve practical systems.”
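As an illustration of how goals become computational objects, the sketch below models a goal lifecycle as a small state machine that goals move through as they are formulated, selected, expanded into plans, executed, and finished or dropped. The particular states and transitions are a simplified assumption, not the chapter's full Goal Lifecycle model.

```python
# Simplified sketch of a goal lifecycle as a state machine; the states and
# allowed transitions are an assumed subset of a typical Goal Reasoning model.
from enum import Enum, auto

class GoalState(Enum):
    FORMULATED = auto()
    SELECTED = auto()
    EXPANDED = auto()    # a plan has been attached to the goal
    EXECUTING = auto()
    FINISHED = auto()
    DROPPED = auto()

ALLOWED = {
    GoalState.FORMULATED: {GoalState.SELECTED, GoalState.DROPPED},
    GoalState.SELECTED:   {GoalState.EXPANDED, GoalState.DROPPED},
    GoalState.EXPANDED:   {GoalState.EXECUTING, GoalState.DROPPED},
    GoalState.EXECUTING:  {GoalState.FINISHED, GoalState.FORMULATED, GoalState.DROPPED},
}

class Goal:
    def __init__(self, description):
        self.description = description
        self.state = GoalState.FORMULATED

    def transition(self, new_state):
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

# A goal can be re-formulated mid-execution when circumstances change.
g = Goal("survey area B")
for s in (GoalState.SELECTED, GoalState.EXPANDED,
          GoalState.EXECUTING, GoalState.FORMULATED):
    g.transition(s)
```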
4. Social Planning for Trusted Autonomy
This chapter examines the role of social planning for trusted autonomy. Unlike classical planning, which struggles with the complexities of multi-agent settings, social planning allows artificial agents to reason about the mental states of human collaborators and make plans that are more intuitive and explainable from the human perspective. We outline challenges in multi-agent planning, such as nested beliefs, and present a formal model for multi-agent epistemic planning. We also discuss two case studies of semi-autonomous systems utilizing social planning: one in a search and rescue mission, and the other in a collaborative manufacturing task.
Key concept: “Social planning is machine planning in which the planning agent maintains and reasons with an explicit model of the humans with which it interacts, including the human’s goals, intentions, beliefs, as well as their potential behaviours.”
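Nested beliefs of the kind epistemic planning must handle (the robot's beliefs about the human's beliefs) can be illustrated with a deliberately naive data structure; the sketch below shows only the bookkeeping, not the chapter's formal model, and the agent names and propositions are made up.

```python
# Naive sketch of nested beliefs for epistemic planning: an agent keeps a
# belief store, and beliefs about another agent's beliefs are nested inside it.
robot_beliefs = {
    "door_open": True,                    # the robot's own belief about the world
    "human": {                            # the robot's model of the human
        "door_open": False,               # what the robot thinks the human believes
        "goal": "fetch first-aid kit",
    },
}

def believes(beliefs, chain, proposition):
    """Follow a chain of agents, e.g. ["human"], then look up a proposition."""
    for agent in chain:
        beliefs = beliefs[agent]
    return beliefs.get(proposition)

# A mismatch between the robot's belief and its model of the human's belief is
# exactly what a social planner can act on (e.g. plan a step that informs the human).
if robot_beliefs["door_open"] != believes(robot_beliefs, ["human"], "door_open"):
    print("plan step: tell the human the door is already open")
```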
5. A Neuroevolutionary Approach to Adaptive Multi-agent Teams
This chapter explores an approach to multi-agent systems called the Adaptive Team of Agents (ATA). ATAs are homogeneous teams, meaning all agents are capable of performing any sub-task, but they self-organize a division of labor in situ to adapt to changes in their environment. This adaptability and flexibility make them less brittle than heterogeneous teams. We investigate ATAs using neuroevolution to train artificial neural networks to control the agents in a simple strategy game called Legion II.
Key concept: “An ATA is a homogeneous team that self-organizes a division of labor in situ so that it behaves as if it were a heterogeneous team.”
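Neuroevolution of the kind used to train ATA controllers follows a simple evaluate/select/mutate loop over network weight vectors. The sketch below shows that loop in miniature; the fitness function is a stand-in for "play Legion II with all agents sharing this controller and score the team", and the population sizes and mutation scheme are assumptions rather than the chapter's actual setup.

```python
# Minimal neuroevolution loop: evolve weight vectors for a fixed-topology
# controller shared by all agents of a homogeneous team.
import numpy as np

rng = np.random.default_rng(0)
N_WEIGHTS, POP, GENERATIONS, ELITE = 32, 20, 50, 5

def fitness(weights):
    # Placeholder for "run the game with every agent sharing this controller
    # and return the team's score"; a dummy objective keeps the sketch runnable.
    return -np.sum((weights - 0.5) ** 2)

population = [rng.normal(size=N_WEIGHTS) for _ in range(POP)]
for _ in range(GENERATIONS):
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:ELITE]
    children = []
    # Refill the population with mutated copies of the elite controllers.
    while len(elites) + len(children) < POP:
        parent = elites[rng.integers(len(elites))]
        children.append(parent + rng.normal(scale=0.1, size=N_WEIGHTS))
    population = elites + children

best = max(population, key=fitness)
print("best fitness:", fitness(best))
```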
6. The Blessing and Curse of Emergence in Swarm Intelligence Systems
This chapter delves into the blessings and curses of emergent behavior in swarm intelligence systems. Emergent behavior arises from the interaction of many relatively simple individual agents without centralized control, producing complex, coordinated group behavior. The advantages of emergence include simplicity, robustness, flexibility, environment integration, scalability, autonomy, and parallelism. However, emergence poses challenges to predictability and controllability, particularly when applied to trusted autonomous systems.
Key concept: “A system exhibits emergence when there are coherent emergents at the macro-level that dynamically arise from the interactions between the parts at the micro-level. Such emergents are novel w.r.t. the individual parts of the system.”
7. Trusted Autonomous Game Play
This chapter looks at the importance of trusted autonomy in the design and play of digital games. We discuss the four defining traits of games: goals, rules, feedback, and voluntary participation, in relation to trust and autonomy. We argue that Trusted Autonomy could play a key role in game AI, making AI opponents and teammates more responsive, intelligent, and believable. We also envision Trusted Autonomous games that adapt to the player’s playstyle and preferences, and Trusted Autonomous game communities that promote fair play and a safe environment for all players.
Key concept: “When you strip away the genre differences and the technological complexities, all games share four defining traits: a goal, rules, a feedback system, and voluntary participation.”
8. The Role of Trust in Human-Robot Interaction
This chapter examines the complex role of trust in Human-Robot Interaction (HRI). We highlight that human reliance on robots hinges on their ability to meet expectations. We explore various models of trust, emphasizing the dynamic nature of trust relations and the factors that influence trust, such as system properties like reliability, predictability, and transparency. We delve into the differences and similarities between human trust in other humans versus human trust in robots, and discuss existing instruments for measuring trust in automation and robots.
Key concept: “…an attitude which includes the belief that the collaborator will perform as expected, and can, within the limits of the designer’s intentions, be relied on to achieve the design goals.”
9. Trustworthiness of Autonomous Systems
This chapter delves into the trustworthiness of autonomous systems, arguing that effective robots and autonomous systems must be perceived as trustworthy by users. We explore the concept of trustworthiness from various perspectives, including philosophical considerations of trustworthiness as a property and the factors that influence trust, such as reliability, competence, and integrity. We present a model of trust based on two primary dimensions: competence and integrity, and discuss how these dimensions relate to different levels of trust across various applications, including high-risk situations such as autonomous weapons systems.
Key concept: “The management two-component model of trust differentiates competence (consisting of skills, reliability and experience) and integrity (consisting of motives, honesty and character).”
10. Trusted Autonomy Under Uncertainty
This chapter explores the concept of trusted autonomy in relation to uncertainty, emphasizing the importance of understanding the connections between trust, distrust, and uncertainty. Trusting an autonomous system involves relinquishing some control and accepting unknowns. We discuss different kinds of uncertainty, including probabilistic uncertainty, ambiguity, conflict, and sample space ignorance, and their implications for trust in human-robot interaction (HRI). We also discuss how social dilemmas, such as the ‘driverless car dilemma,’ can pose a challenge to human trust in autonomous agents that are programmed to be rational.
Key concept: “Relinquishment of knowledge and control is primarily what distinguishes trust relationships from contracts (or assurance).”
11. The Need for Trusted Autonomy in Military Cyber Security
This chapter explores the urgent need for trusted autonomy in military cyber security, driven by the exponentially growing volume and sophistication of cyber threats. We present four fundamental principles for cyber security, including the importance of a secure foundation and recognizing that everyone in the organization contributes to cyber security. We then outline the key challenges for military cyber security, framed in terms of five ‘V’s: volume, visualization, velocity, variety, and variability.
Key concept: “But no matter how good these systems are, cyber security can only be as good as the organisation’s people make it.”
12. Reinforcing Trust in Autonomous Systems: A Quantum Cognitive Approach
This chapter proposes using quantum cognition to model human biases and judgment errors in order to improve communication and trust between humans and autonomous systems. We argue that incorporating such models into autonomous systems could help the user avoid mistakes and thereby prevent an erosion of trust. As a key example, we present the conjunction fallacy, in which humans judge the probability of a conjunction of events to be higher than the probability of one of its constituent events, and provide an explanation for this phenomenon using the framework of quantum cognition.
Key concept: “In the foreseeable future, humans and autonomous systems will engage in shared decision making. Given the discrepancies between the way they arrive at decisions, whose form of rationality should be given precedence: human or machine?”
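A small worked example shows how the quantum-cognition account permits the conjunction fallacy: when the two judgments correspond to non-commuting projections, the sequential judgment ("feminist, then bank teller") can receive a higher probability than the single judgment ("bank teller"). The two-dimensional state and the angles below are arbitrary illustrative choices, not values from the chapter.

```python
# Toy 2-D quantum-cognition model of the conjunction fallacy. The belief state
# and the two judgment directions are unit vectors; judging a statement
# projects the state onto that direction.
import numpy as np

def unit(theta_deg):
    t = np.radians(theta_deg)
    return np.array([np.cos(t), np.sin(t)])

psi = unit(0)           # belief state evoked by the description of the person
feminist = unit(40)     # direction representing "is a feminist"
bank_teller = unit(85)  # direction representing "is a bank teller"

def project(state, direction):
    return direction * np.dot(direction, state)

# Single judgment: probability the person is a bank teller.
p_teller = np.dot(bank_teller, psi) ** 2

# Sequential judgment: "feminist" first, then "bank teller".
after_feminist = project(psi, feminist)
p_feminist_then_teller = np.dot(bank_teller, after_feminist) ** 2

print(f"P(bank teller)                = {p_teller:.3f}")                # ~0.008
print(f"P(feminist, then bank teller) = {p_feminist_then_teller:.3f}")  # ~0.293
```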
13. Learning to Shape Errors with a Confusion Objective
This chapter explores the use of a confusion objective function for training neural networks in order to shape the distribution of classification errors, making them more desirable in high consequence situations. We present a derivation of maximum likelihood classifiers to match a target error profile and provide an implementation using Google’s TensorFlow. A series of experiments is conducted to evaluate the effectiveness of this approach for ‘error trading’ and ‘adversarial errors’, with promising results in both areas.
Key concept: “Approaches to CSS include reweighting (stratifying) available training data so that more costly errors will incur a larger overall cost [5], cost-sensitive boosting [13], [4] that combine multiple weak or diverse learners, or by changing the learning algorithm.”
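The chapter's own experiments use TensorFlow; the sketch below illustrates only the simplest approach named in the quote, reweighting so that errors on a designated high-consequence class incur a larger loss, written in plain NumPy to stay self-contained. The class weights and toy predictions are assumptions for illustration.

```python
# Cost-sensitive reweighting sketch: scale each example's cross-entropy loss by
# a weight tied to its true class, so errors on the costly class dominate training.
import numpy as np

# Toy setup: 3 classes; errors on class 2 are deemed 5x more costly.
class_weights = np.array([1.0, 1.0, 5.0])

def weighted_cross_entropy(probs, labels):
    """probs: (N, C) predicted class probabilities; labels: (N,) true class indices."""
    eps = 1e-12
    per_example = -np.log(probs[np.arange(len(labels)), labels] + eps)
    return np.mean(class_weights[labels] * per_example)

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])   # least confident on the class-2 example
labels = np.array([0, 1, 2])

print(weighted_cross_entropy(probs, labels))  # the class-2 error dominates the average
```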
14. Developing Robot Assistants with Communicative Cues for Safe, Fluent HRI
This chapter focuses on the development of autonomous robot assistants for manufacturing, exploring how incorporating natural communicative cues into the design of these systems can lead to safer and more fluent human-robot interaction (HRI). We present an overview of research efforts at the CARIS laboratory at the University of British Columbia, describing the three-phased design process they use for identifying, modeling, and implementing naturalistic communicative cues in robotic systems. We then discuss various studies conducted by CARIS investigating explicit and non-explicit communicative cues, ranging from human-robot handovers, to gaze behaviors and hesitation during interaction, to tap and push style interactions.
Key concept: “These are human behaviors which communicate information to other people. Explicit cues are intentional behaviors performed with the purpose of communication, such as when one performs a hand gesture [6, 8, 9, 21]. Non-explicit cues are unintentionally performed, but broadcast intentions or other important information, such as when a person looks toward where they are about to hand an object to another person [1, 17, 23].”
15. Intrinsic Motivation for Truly Autonomous Agents
This chapter argues that intrinsic motivation, meaning internally generated needs and preferences, is essential for truly autonomous agents operating in complex, uncertain, and unpredictable environments. Intrinsic motivation is crucial because it allows agents to adapt to changing circumstances and unexpected events without relying on pre-programmed responses. We present an overview of different models of intrinsic motivation, including Murray’s basic needs, Reiss’s 16 basic desires, and our own theory as embodied in the Clarion cognitive architecture. We also discuss how understanding the motivation of other agents is central to trust.
Key concept: “In fundamentally unpredictable environments, a key aspect that one can be certain of is stable internal needs and preferences – that is, intrinsic motivation.”
16. Computational Motivation, Autonomy and Trustworthiness: Can We Have It All?
This chapter examines the relationship between computational motivation, autonomy, and trustworthiness in autonomous systems. We argue that intrinsic motivation can improve the functionality of autonomous systems, particularly in complex, uncertain environments. We explore the potential benefits and challenges of intrinsic motivation, including its ability to enhance diversity, adaptation, exploration, and open-ended goal formulation, in the context of intrinsically motivated agent swarms. We then discuss how these properties might impact the perceived trustworthiness of autonomous agents in terms of reliability, privacy, safety, complexity, risk, and free will.
Key concept: “Risk here is the potential of losing something of value, weighed against the potential to gain something of value (an incentive).”
17. Are Autonomous-and-Creative Machines Intrinsically Untrustworthy?
This chapter explores the relationship between autonomy and creativity in artificial agents and poses a theorem (Theorem ACU, TACU) which states, under certain formal assumptions, that an artificial agent that is autonomous (A) and creative (C) will tend to be, from the standpoint of a fully informed rational agent, intrinsically untrustworthy (U). We establish a formal model of this principle (PACU, for the human sphere) using a cognitive calculus (DeCEC), and demonstrate its veracity through a series of simulations using a novel automated theorem proving program (ShadowProver).
Key concept: Theorem ACU: In a collaborative situation involving agents a₁ (as the “trustor”) and a₂ (as the “trustee”), if a₂ is at once both autonomous and ToM-creative, a₂ is untrustworthy from an ideal-observer o’s viewpoint, with respect to the action-goal pair α, γ in question.
18. Trusted Autonomous Command and Control
This chapter examines the future potential for trusted autonomous command and control (C2) systems in military operations. We highlight the increasing complexity and tempo of warfare, particularly in the cyber domain, and the challenges this poses for human decision-making. We argue that autonomous C2 systems may offer significant advantages in such situations by reducing the risk of human error and enabling faster, more efficient responses to threats. We present a series of plausible scenarios, drawing on historical examples of both success and failure in human-based C2, to illustrate the potential benefits and risks of adopting trusted autonomous C2 in future military operations.
Key concept: “The primary intent therefore is to ensure vulnerabilities are minimised and responses are quick, coherent and effective when facing cyber incidents.”
19. Trusted Autonomy in Training: A Future Scenario
This chapter explores the potential implications of trusted autonomous systems in future training and education environments. We examine the ongoing shift from traditional teacher-centered learning towards more student-centered, technology-enabled approaches like Massive Open Online Courses (MOOCs). We argue that trust is a key factor in successful training, and present a system map outlining the key drivers likely to shape the future of trusted autonomy in training: autonomous systems development, training systems development, and trust. We then use the concept of familiarity to develop a theory of how trust in autonomous training systems may evolve over time, and present three narratives illustrating possible future scenarios.
Key concept: “We can speculate, with a fair degree of accuracy, that trust between two entities, is a function of their familiarity. That is, the more familiar you are with someone else, the more likely you are to trust them. Regarding trusted autonomy, this is illustrated as a function over time in Fig. 19.2.”
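One plausible way to render "trust as a function of familiarity" quantitatively is a saturating curve in which trust rises with accumulated familiarity and levels off. The exponential form and parameters below are an assumed illustration, not the curve shown in Fig. 19.2.

```python
# Illustrative saturating trust curve: trust rises with accumulated familiarity
# (e.g. hours of interaction with the system) and levels off at a ceiling.
import numpy as np

def trust(familiarity_hours, ceiling=1.0, rate=0.05):
    """Assumed exponential-saturation form: trust approaches the ceiling over time."""
    return ceiling * (1.0 - np.exp(-rate * familiarity_hours))

for hours in (0, 10, 50, 200):
    print(hours, round(trust(hours), 2))   # 0.0, 0.39, 0.92, 1.0
```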
20. Future Trusted Autonomous Space Scenarios
This chapter presents future scenarios for trusted autonomous space systems. We begin by describing the harsh and remote nature of the space environment, emphasizing the complexities associated with operating in this domain and the challenges posed for traditional approaches to spacecraft operations. We argue that trusted autonomous systems are a desirable application for space missions due to their potential to enhance safety, efficiency, and exploration capabilities. We then outline several future scenarios, including autonomous space operations, autonomous space traffic management systems, and autonomous disaggregated space systems, highlighting the need for trust in enabling these systems to be successfully deployed.
Key concept: “The distance from Earth to assets in space, whether they be in deep space or GEO or even LEO, is such that latency of communications can be an important issue, particularly when timeliness is important - for example, when performing planetary orbit insertions, or avoiding collision events.”
21. An Autonomy Interrogative
This chapter examines the concept of autonomy from an economic perspective, arguing that autonomous decision-making is essentially a matter of allocating scarce resources under uncertainty. We draw on insights from economics, particularly Keynesian economics, to highlight the challenges posed by fundamental uncertainty, which is distinct from stochastic risk and cannot be easily measured or modeled. We propose the concept of ‘plasticity’ to describe the ability of an autonomous agent to countenance unpredictable future states of its environment. We then explore various decision-making strategies, such as barbell strategies, that have been successfully applied in economic settings to handle uncertainty. We argue that these strategies, which focus on avoiding catastrophic losses and making opportunistic bets, may offer valuable insights for designing truly autonomous systems capable of operating in complex, uncertain environments.
Key concept: “The difference between an autonomous agent and an economic one is a matter of emphasis that directs subsequent research problem choices. When we talk of artificial intelligence, it usually means we are mostly interested in developing algorithms and their technical implementations in machines; when we talk of economic agents, it means we are mainly concerned with individual and system-wide outcomes given different mixes of different kinds of interacting resource-allocating agents under various environmental conditions.”
Essential Questions
1. How can we design autonomous systems that are capable of acting effectively in complex, uncertain, and unpredictable environments?
This question probes the fundamental challenge of designing autonomous systems that are capable of acting effectively in complex, uncertain, and unpredictable environments. The book emphasizes the limitations of traditional AI approaches that rely on pre-programmed behaviors or statistical models. It argues that achieving true autonomy requires agents to have internal motivational processes, enabling them to adapt to changing circumstances and unforeseen events. The book highlights the role of intrinsic motivation, drawing inspiration from human psychology and cognitive architectures like Clarion, to guide the development of more robust and flexible autonomous agents.
2. What is the relationship between trust and autonomy, and how can we design trustworthy autonomous systems?
This question explores the intricate relationship between trust and autonomy. Trust is essential for human acceptance and adoption of autonomous systems, especially in high-risk scenarios. The book examines different aspects of trust, including reliability, predictability, transparency, and integrity. It discusses how these factors can be designed into autonomous systems and evaluated by humans, drawing on models from psychology, human-computer interaction, and organizational behavior. It also explores the challenges of maintaining trust in the face of autonomous system failures and the potential for developing systems that can actively cultivate trust.
3. How does social interaction influence trust in autonomous systems, and how can we design systems that are socially intelligent and responsible?
This question examines the role of social interaction in the development of trusted autonomy. The book highlights the importance of designing systems that can understand and respond to human behavior and social cues. It explores topics such as social planning, which involves reasoning about the mental states of human collaborators, and human-robot interaction (HRI), which focuses on designing natural and intuitive interfaces for communication and collaboration. It also touches on the role of reputation models, inverse trust, and rebel agents in navigating the social dynamics of trust in multi-agent systems.
4. What are the implications of emergent behavior for trusted autonomy, and how can we harness its benefits while mitigating its risks?
This question explores the potential benefits and challenges of emergent behavior in autonomous systems, particularly in the context of trust. Emergent behavior arises from the interaction of many individual agents without centralized control, producing complex, coordinated group behavior. The book discusses the advantages of emergence, such as robustness, flexibility, and scalability, but also highlights the challenges it poses for predictability and controllability. It argues that understanding and managing emergence is crucial for building trustworthy autonomous systems, and explores strategies for achieving this goal, including the concept of ‘swarm engineering’.
5. What are the potential applications of Trusted Autonomy across various domains, and how can we leverage its capabilities to address real-world problems?
This question delves into the practical applications of trusted autonomy across various domains. The book provides a number of detailed case studies and future scenarios, ranging from military cyber security and space operations to training and education environments. It examines how trusted autonomy can be leveraged to improve efficiency, safety, and effectiveness in these domains, while addressing the specific challenges and opportunities presented by each context. These case studies highlight the broad applicability of Trusted Autonomy and underscore its potential to transform how we work, learn, and interact with technology.
Key Takeaways
1. Transparency is Essential for Trust
Transparency, meaning the ability of an autonomous system to explain its reasoning and actions to humans, is a crucial factor in fostering trust. By making its decision-making processes clear and understandable, an autonomous system can help users feel more confident in its capabilities and more comfortable relying on it, even in uncertain or unexpected situations. This is especially important in high-risk domains where the consequences of errors could be severe.
Practical Application:
In the development of an autonomous delivery robot, the system could be designed to provide explanations for its actions, such as ‘I am rerouting due to a road closure’, or ‘I am waiting here because the traffic light is red’. This transparency can help users understand the robot’s behavior and build trust in its decision-making capabilities.
2. Homogeneous Agents Offer Greater Adaptability and Robustness
Heterogeneous multi-agent systems, where agents are specialized for specific sub-tasks, can be brittle and inflexible. Homogeneous multi-agent systems, where all agents are capable of performing any sub-task, are more adaptable and robust. Adaptive Teams of Agents (ATAs), which self-organize a division of labor in situ, represent a promising approach to building flexible and resilient multi-agent systems.
Practical Application:
In a military setting, an Adaptive Team of Agents (ATA) could be used to control a group of unmanned aerial vehicles (UAVs). If one UAV is damaged or lost, the remaining UAVs can adapt and reorganize to continue performing the mission, ensuring greater resilience and robustness in operation.
3. Barbell Strategies Provide Robustness in Uncertain Environments
In environments characterized by fundamental uncertainty, where the future is unpredictable and traditional reward-maximization strategies are ineffective, barbell strategies offer a robust approach to decision-making. Barbell strategies involve allocating the majority of resources to conservative, low-risk options to avoid catastrophic losses, while dedicating a smaller portion to high-risk, high-reward opportunities. This approach prioritizes survival and robustness over maximizing expected utility.
Practical Application:
A self-driving car could employ a barbell strategy by allocating most of its computational resources to safety-critical functions, such as obstacle avoidance and emergency braking, while dedicating a smaller portion of its resources to exploring alternative routes or optimising fuel efficiency. This approach prioritizes safety and reliability, while still allowing for exploration and innovation.
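The allocation itself is easy to state concretely: commit most of the budget to the safe option and a small fixed fraction to high-variance bets, with nothing in between. The 90/10 split and the toy payoffs below are illustrative assumptions, but they show the defining property that worst-case losses stay capped.

```python
# Barbell allocation sketch: 90% of the budget goes to a safe, low-variance
# option; 10% goes to a high-risk, high-reward bet. Losses are capped by design.
import numpy as np

rng = np.random.default_rng(1)
budget = 100.0
safe_fraction = 0.9

def barbell_outcome():
    safe = budget * safe_fraction * 1.01            # near-certain small return
    risky_stake = budget * (1 - safe_fraction)
    # The risky bet is usually lost but occasionally pays off 20x.
    risky = risky_stake * (20.0 if rng.random() < 0.05 else 0.0)
    return safe + risky

outcomes = np.array([barbell_outcome() for _ in range(10_000)])
print("worst case:", outcomes.min())    # never below ~90.9: the downside is bounded
print("mean      :", outcomes.mean())
```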
Suggested Deep Dive
Chapter: Chapter 2: Universal Artificial Intelligence: Practical Agents and Fundamental Challenges
This chapter provides a solid theoretical foundation for understanding the potential and limitations of intelligent agents, highlighting the fundamental challenges associated with designing truly intelligent and trustworthy systems. It introduces the concept of Universal Artificial Intelligence (UAI) and presents a mathematical framework for designing optimal agents in unknown environments. This chapter sets the stage for understanding the complexities of trust in the context of increasingly capable and autonomous AI.
Memorable Quotes
2.2 Background and History of AI
Ultimately, we wish to build systems that solve problems and act appropriately; whether the systems are inspired by humans or follow philosophical principles is only a secondary concern.
2.3.5 AIXI – Putting It All Together
Having a good understanding of the behaviour and consequences an autonomous system strives towards, is essential for us being able to trust the system.
3.2.3 An Application for Human-Robot Teaming
Goal selection methods based on motivators (or other primitives) allow autonomous agents to make their own decisions in situations their designers did not anticipate. This may be viewed as a step toward greater autonomy, but does not necessarily establish a mutual understanding of trust as a concept [1].
4.2 Motivation and Background
So the work described here, on computational mechanisms for constructing and representing explainable plans in human-agent interactions, addresses one aspect of what it will take to meet the requirements of a trusted autonomous system.
6.2 Emergence in Swarm Intelligence
While the emergent behaviour of swarm intelligence systems has proven useful in solving complex real-world problems, as Parunak notes: “Neither self-organization nor emergence is necessarily good” [15].
Comparative Analysis
This book stands as a singular and timely exploration of the burgeoning field of Trusted Autonomy. While other works in AI often focus on narrow technical aspects, this book takes a more holistic view, exploring the philosophical, social, and practical dimensions of trust in autonomous systems. It shares common ground with texts on AI safety and ethics, such as “Superintelligence” by Nick Bostrom and “Human Compatible” by Stuart Russell, in emphasizing the need for robustly beneficial AI. However, it delves deeper into the specific challenges and opportunities associated with achieving Trusted Autonomy across a wide range of domains, including military operations, cyber security, training, space exploration, and human-robot collaboration. It provides a breadth of perspectives rarely found in other books, making it a valuable resource for anyone seeking to understand the complex landscape of Trusted Autonomy.
Reflection
This book presents a compelling case for the importance of Trusted Autonomy, highlighting the intricate interplay of technical capabilities, social intelligence, and ethical considerations in designing autonomous systems that are worthy of human trust. The authors convincingly argue that as autonomous systems become increasingly prevalent and sophisticated, understanding the nature of trust and the factors that influence it will be crucial for their successful integration into society. The book’s strength lies in its interdisciplinary approach, bringing together diverse perspectives to provide a holistic view of Trusted Autonomy. However, the book’s focus on specific scenarios, while insightful, also leaves open questions about the generalizability of its findings to broader contexts. Moreover, some of the proposed solutions, such as the reliance on reputation models or the development of ‘trust-aware’ machines, may raise further ethical and philosophical dilemmas that require careful consideration. Nevertheless, this book serves as a valuable contribution to the ongoing discussion about Trusted Autonomy, providing a solid foundation for future research and development in this critical field.
Flashcards
What is Autonomy?
The ability and the inclination of an agent to act autonomously, at its own discretion.
What is Trustworthiness?
A property of an agent or organisation that engenders trust in another agent or organisation.
What is Trust?
A psychological state in which a person makes themselves vulnerable because they are confident that other agents will not exploit them.
What is an Adaptive Team of Agents (ATA)?
A homogeneous team that self-organizes a division of labor in situ, adapting to changing circumstances.
What is Emergent Behavior?
Behavior at the global level that was not programmed at the individual level and cannot be readily explained based on individual behavior.
What is the goal of a UAI agent?
Maximization of reward, subject to the agent’s beliefs about the environment.
What is Goal Reasoning?
The ability of an autonomous agent to reason about and dynamically adapt its goals, enabling adaptation to unexpected events or changes.
What is Social Planning?
Machine planning that incorporates an explicit model of the humans it interacts with, including their goals, intentions, and beliefs.